Search results for "Regular expression"
showing 10 items of 13 documents
Noise-tolerant efficient inductive synthesis of regular expressions from good examples
1997
We present an almost linear time method of inductive synthesis restoring simple regular expressions from one representative (good) example. In particular, we consider synthesis of expressions of star-height one, where we allow one union operation under each iteration, and synthesis of expressions without union operations from examples that may contain mistakes. In both cases we provide sufficient conditions defining precisely the class of target expressions and the notion of good examples under which the synthesis algorithm works correctly, and present the proof of correctness. In the case of expressions with unions the proof is based on novel results in the combinatorics of words. A genera…
Efficient learning of regular expressions from good examples
1994
We consider the problem of restoring regular expressions from expressive examples. We define the class of unambiguous regular expressions, the notion of the union number of an expression showing how many union operations can occur directly under any single iteration, and the notion of an expressive example. We present a polynomial time algorithm which tries to restore an unambiguous regular expression from one expressive example. We prove that if the union number of the expression is 0 or 1 and the example is long enough, then the algorithm correctly restores the original expression from one good example. The proof relies on original investigations in theory of covering symbol sequences (wo…
On languages factorizing the free monoid
1996
A language X⊂A* is called factorizing if there exists a language Y⊂A* such that XY = A* This work was partially supported by ESPRIT-EBRA project ASMICS contact 6317 and project 40% MURST “Algoritmi, Modelli di Calcolo e Strutture Informative”. and the product is unambiguous. First we give a combinatorial characterization of factorizing languages. Further we prove that it is decidable whether a regular language X is factorizing and we construct an automaton recognizing the corresponding language Y. For finite languages we show that it suffices to consider words of bounded length. A complete characterization of factorizing languages with three words and explicit regular expression for the co…
Longest Motifs with a Functionally Equivalent Central Block
2004
International audience; This paper presents a generalization of the notion of longest repeats with a block of k don't care symbols introduced by [Crochemore et al., LATIN 2004] (for k fixed) to longest motifs composed of three parts: a first and last that parameterize match (that is, match via some symbol renaming, initially unknown), and a functionally equivalent central block. Such three-part motifs are called longest block motifs. Different types of functional equivalence, and thus of matching criteria for the central block are considered, which include as a subcase the one treated in [Crochemore et al., LATIN 2004] and extend to the case of regular expressions with no Kleene closure or …
Testing Grammars for Parsability
1990
In the preceding chapters we have studied in detail the major methods of deterministic context-free parsing: strong LL(k) parsing (Chapter 5), simple precedence parsing (Chapter 5), canonical LR(k) parsing, LALR(k) parsing, and SLR(k) parsing (Chapters 6 and 7), and canonical LL(k) parsing (Chapter 8). Each of these methods induces a class of grammars that are “parsable” using that method, that is, a class of grammars for which a deterministic parser employing that method can be constructed. For example, the LL(k) grammars constitute the class of grammars parsable by the LL(k) parsing method. By definition, a context-free grammar is an LL(k) grammar if and only if its canonical LL(k) parser…
Efficient algorithm for learning simple regular expressions from noisy examples
1994
We present an efficient algorithm for finding approximate repetitions in a given sequence of characters. First, we define a class of simple regular expressions which are of star-height one and do not contain union operations, and a stochastic mutation process of a given length over a string of characters. Then, assuming that a given string of characters is obtained corrupted by the defined mutation process from some long enough word generated by a simple regular expression, we try to restore the expression. We prove that to within some reasonable accuracy it is always possible if the length of the mutation process is bounded comparing to the length of the example. We provide an algorithm by…
The Star Height One Problem for Irreducible Automata
1993
The star height of a regular expression is, informally, the maximum number of nested stars in the expression. The star height of a regular language is the minimal star height of a regular expression denoting this language. The notion of star height indicates in a certain sense the “loop complexity” of a regular expression and thus it gives a measure of the complexity of a regular language.
Ambiguity and complementation in recognizable two-dimensional languages
2008
The theory of one-dimensional (word) languages is well founded and investigated since fifties. From several years, the increasing interest for pattern recognition and image processing motivated the research on two-dimensional or picture languages, and nowadays this is a research field of great interest. A first attempt to formalize the concept of finite state recognizability for two-dimensional languages can be attributed to Blum and Hewitt ([7]) who started in 1967 the study of finite state devices that can define two-dimensional languages, with the aim to finding a counterpart of what regular languages are in one dimension. Since then, many approaches have been presented in the literature…
Learning a class of regular expressions via restricted subset queries
1992
A wide class of regular expressions non-representable as unions of “smaller” expressions is shown to be polynomial-time learnable via restricted subset queries from arbitrary representative examples “reflecting” the loop structure and a way the input example is obtained from the unknown expression. The corresponding subclass of regular expressions of loop depth at most 1 is shown to be learnable from representative examples via membership queries. A wide class of expressions with loops A+ of arbitrary loop depth is shown to be learnable via restricted subset queries from arbitrary examples.
Learning of regular expressions by pattern matching
1995
We consider the problem of restoring regular expressions from good examples. We describe a natural learning algorithm for obtaining a “plausible” regular expression from one example. The algorithm is based on finding the longest substring which can be matched by some part of the so far obtained expression. We believe that the algorithm to a certain extent mimics humans guessing regular expressions from the same sort of examples. We show that for regular expressions of bounded length successful learning takes time linear in the length of the example, provided that the example is “good”. Under certain natural restrictions the run-time of the learning algorithm is polynomial also in unsuccessf…